ABSTRACT



The problem of determining the quantity of classrooms,
laboratories and instructors needed to train sections of
students attending numerous distinct courses in a school such
as the Fleet Ballistic Missile School is considered.

A procedure is developed for determining feasible
schedules that graduate a fixed number of trainees
over time while minimizing the cost of the facilities mix
required.
I. INTRODUCTION



The problem of determining the capacity of a school to 
train sections of students attending numerous distinct 
courses has been considered by Willingham [1] as an opti- 
mization problem. He used linear programming techniques 
for determining the maximum number of convenings of each 
type of course taught at the school which can take place 
during a one-year period subject to resource constraints and
lower bounds on the number of convenings of each type of
course.

This problem arises with the conversion of FBM sub- 
marines from Polaris to Poseidon missile systems. There 
exists the necessity of assuring a proper quantity and mix 
of resources to instruct all required personnel at the Naval 
Guided Missile School. Here the primary resources of inter- 
est are laboratory facilities of various types with their 
associated equipment, classrooms and instructors of each 
specialty pertinent to Poseidon technical training [1]. 

The problem as stated from the point of view of those
responsible for the funding and operation of the school is:
"Given the planning estimates from BUPERS of the number of
personnel who require training in each type of course over a
specified time period, determine what level and mix of
resources is adequate to carry out the mission." The present
work will consider this last aspect of the problem,
attacking it under the assumption that carefully scheduling
the convenings of each section will yield a minimum
requirement of resources for a given time period.

The above idea can be expressed more precisely and in
general as follows: An important function of any school
planning is to estimate the number of facilities, namely
laboratories, classrooms and instructors, that are needed
to support a specified program of instruction and to specify,
in general terms, the schedule that obtains the best
employment of the resources. By a specified program we mean
the type of courses to be taught, their curricular
requirements and the quantity of students assigned to each
one. The problem is, then, to find that schedule which
requires the least-cost set of facilities.

Since a particular schedule dictates a required set 
of facilities, it is necessary to have a measure of some 
characteristic in order to be able to compare it to other 
feasible schedules. A measure of effectiveness of a sched- 
ule will be defined as the cost involved in the installation 
of all facilities associated with it. 

In general, specifying a measure of effectiveness in 
a scheduling problem is specifying a set of equivalence 
classes of schedules and a preference ordering among these 
classes. For example, specifying the cost of the facilities 
needed as the measure of effectiveness for a particular 
problem means that: 

(a) All schedules that have a cost $C_1$ are equivalent, so
one is indifferent as to which schedule is selected.

(b) A schedule with cost $C_1$ is preferred to a schedule
with cost $C_2$ if and only if $C_1$ is strictly less than $C_2$.

From the above discussion it can be seen then, that 
the minimal cost scheduling problem is one of evaluating 
schedules and the associated combination of resources in 
the search for the minimum cost. It is clear that there is 
a very large number of possible schedules for the problem. 

It is also clear that most such schedules are uninteresting 
for reasonable measures of performance and only a few are 
worth considering. 

Although the least cost facility mix scheduling problem
is easy to state and to visualize, no easy solution is
available. Evaluating the set of all feasible solutions
(represented by all possible combinations of schedules and
facilities) is, for practical purposes, a gigantic quest.

The objective of this paper is to illustrate a method of
solving the problem by applying the concepts developed in
[2]. A procedure will be presented for approaching one
schedule of the equivalence class of schedules that requires
the least cost of the resources for a given program of
instruction for a given time period.

A first attempt to obtain an acceptable solution to
the scheduling problem can be made by evaluating the costs
of random schedules (which, consequently, are random
variables too), estimating the distribution of the costs, and
then attempting to estimate the probability that, in a sample
of size n, at least one will be in the lowest fraction $P_r$
of the population. For a discrete random variable, as will
be the case here, let $p_r$ be the probability that a single
observation has a cost value equal to r. Then the
probability that the smallest member of a sample of size n
has a cost value less than r is given by:

$$1 - (1 - P_r)^n \approx 1 - e^{-nP_r}, \qquad \text{where } P_r = \sum_{i=1}^{r-1} p_i .$$

For obvious reasons it is desirable to have an estimate
of the lowest cost which is as close as possible to the
true minimum cost of all possible schedules. Thus $P_r$ must
represent a very small fraction of the lower tail of the
distribution. Consequently the individual $p_i$'s near the
extreme are very small, and so the probability is small that
the most extreme cases will be present in any random
sample of feasible size [4].
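
The argument above can be made concrete with a short numerical sketch. It is illustrative only: the function names and the tail fraction of 0.001 are assumptions chosen for the example, not values from this study. It compares the exact probability with the exponential approximation and computes how many random schedules would be needed to land in the tail with a given confidence.

```python
import math

def prob_min_below(p_tail, n):
    """Probability that the smallest of n independent draws falls in a
    lower tail holding probability mass p_tail.
    Returns (exact, exponential approximation)."""
    exact = 1.0 - (1.0 - p_tail) ** n
    approx = 1.0 - math.exp(-n * p_tail)
    return exact, approx

def draws_needed(p_tail, confidence):
    """Smallest sample size n whose exact probability reaches `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_tail))

# Hypothetical tail: the cheapest 0.1 per cent of all schedules.
exact, approx = prob_min_below(0.001, 1000)
n_95 = draws_needed(0.001, 0.95)
```

Under these assumed numbers roughly three thousand random schedules are needed for a 95 per cent chance of a single hit, which is the practical weakness of pure random sampling that the text points out.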

If a sufficient amount is known about the population
distribution such that it is possible to calculate
approximately the probability that an observation is less
than a specified value, not far from the extreme of the
distribution, then the number of further observations
necessary to improve on this value could be estimated.
However, before an improvement is obtained, a large number of
useless values may be found. These considerations imply that
the rate at which useful information is obtained by random
sampling is much lower when estimating extremes than when
estimating means.

As in any combinatorial problem, the task of evaluating
the distribution of the cost due to schedules is far from
easy and is time consuming. The suggestion is, therefore,
to employ some kind of heuristic algorithm in the
search for the least cost value. If a schedule is generated
at random, reductions in the cost involved are obtained
by applications of the heuristic mechanism, and the
process is repeated n times, then the n lowest costs
generated will tend to be a random sample of extreme values
from the n samples originally generated from the parent
distribution. Then a Weibull distribution can be fitted to
the data. The Weibull has the characteristic of being
independent of the parent distribution and has as one of its
own parameters the minimum value in question [2], [3].

The approach to solving the problem presented in this
study is first to develop a mathematical model that
determines the relationships between schedules and facilities
mix, with the cost of the resources needed as the schedule's
effectiveness measure. Next, a random starting date is
generated for each section of each course and the cost of
this schedule is determined as given by the model; a
heuristic procedure is then applied to that schedule to
reduce the cost involved until a local minimum cost schedule
is found. Finally, a sample of such schedules is used to
estimate the parameters of the Weibull distribution so that a
decision can be made: whether to stop or to generate
additional random schedules and to repeat the estimation of
the parameters.

This thesis is organized in four sections. Section II
develops a mathematical description of the behavior
of a school which must train personnel. In such a school
the courses taught can be of different lengths and each one
can be composed of one or more sections, depending upon the
need for skilled personnel. The model does not deal directly
with the problem of a time-table of meetings within each
week. It is assumed that each type of facility is available
a fixed number of hours per week and that the number of
trainees required in a year for a given course (or specialty)
is divided into sections, which are scheduled to start at
some time during the year. It is this scheduling problem
with which this thesis deals.

Section III deals with elements of the theory of extreme
values and the Weibull distribution. This distribution has
the feature of being bounded and describes the behavior of
the extreme values taken from samples independently drawn
from any parent population.

Section IV proposes a heuristic cost reducing
procedure and a stopping decision rule. Even though the
procedure is rather slow in finding a local minimum and was
not used in the examples discussed in Section V, it is
presented here because, in the author's opinion, it has
the essential characteristics required of an algorithm
to be useful for this problem.

In Section V the results of an example involving a
hypothetical school are presented and the findings discussed.
Appendix A presents a schedule similar to
that proposed in [1], which has the feature of being balanced
in the sense of facility usage by the sections of a given
course throughout the complete period.

The computer program for the scheduling algorithm and 
a brief description of it are included in Appendix B. The 
program was run on the IBM 360/67 system in the "W. R. 
Church" Computer Center at the Naval Postgraduate School, 
Monterey, California. 
II. THE MODEL



The major assumptions in the model are: 

a. Course curricular requirements are given in the 
course syllabus. 

b. Each unit of a specified facility is available a 
fixed number of hours per week. 

c. The total number of students requiring instruction 
is nearly constant and small changes can be absorbed 
by varying the size of the sections. 

d. A section is a physical group of students receiving 
a specified type of instruction. 

e. No restriction is placed on the order in which the
courses are taught, i.e., no course has preference
over any other course. Any section of any course is
equally likely to start at any time of the period.

A. FORMULATION OF THE MODEL AND NOTATION

Let $I = \{i: 1,2,\ldots,L\}$ be the set of all different
types of courses that are taught. Let $N_i$ be defined as
the number of sections of course i which must be taught
during the period of one year. Let $S_i = \{s: 1,2,\ldots,N_i\}$
be the set of sections of course i, for all $i \in I$.

Define $J = \{j: 1,2,\ldots,M\}$ as the set of all types
of facilities such as classrooms, laboratories and groups
of instructors. In the model j = 1 is reserved to
designate the classrooms.

Let $K = \{k: 1,2,\ldots,T\}$ be the set of all working weeks
in a specified period of time, which is one year, and let
$M_{jk}$ be defined as the number of units of facility j
required in week k.

Let $d_i$ be the duration in weeks of course i and $A_{ij}$ be
the total amount of facility j required to teach course type
i, for all $i \in I$ and $j \in J$, $j \ne 1$. For example,
$A_{ij}$ is the number of lab-hours of some laboratory or the
number of lectures or contact-hours required by the $j$-th
group of instructors.
We assume that $d_i$ and $A_{ij}$ are predetermined and are
given as exogenous parameters to the model by the syllabus of
each course.

Sequencing requirements for individual topics or usage
of laboratories within a given type of course are not
considered, and it is assumed that the total requirement
for facility j by course i is spread evenly over the weeks
that comprise the duration of the course. Then it is
possible to define:

$$r_{ij} = \frac{A_{ij}}{d_i}$$

as the weekly demand of facility j by one section of course
i, for all $i \in I$ and all $j \in J$, $j \ne 1$.

Let $DA_j$ be the weekly availability of one unit of
facility j, in hours, for all $j \in J$. $DA_j$ is fixed for
all $k \in K$.
We would like to set up a schedule such that all
sections of all courses are taught during the time period of
interest and the costs involved are minimized. Since, under
the same conditions, the schedules should be the same from
one year to another, if a section has not finished its
instruction at the end of the 50th week it will be included
at the beginning of the cycle for the time necessary to end
its instruction.

Define $k_{is}$ as the starting date of the $s$-th section of
course i, for all $i \in I$ and all $s \in S_i$, with
$1 \le k_{is} \le T$. Note that there exist as many schedules
as the number of distinct combinations of the $k_{is}$. Since,
for a given course, the sections can always be relabeled, the
total number of different ways in which they can be scheduled
is

$$\frac{T^{N_i - 1}}{N_i!}.$$

Now for each section of course i there are

$$\frac{T^{N_{i'} - 1}}{N_{i'}!}$$

combinations among the sections of course $i'$, for
$i' = 1,2,3,\ldots,L$, $i' \ne i$. Then the total number of
different schedules is

$$\frac{T^{\sum_{i=1}^{L}(N_i - 1)}}{\prod_{i=1}^{L} N_i!}. \tag{1}$$
Let $\delta_{isk}$ be equal to 1 if section s of course i is
in progress during the $k$-th week and 0 otherwise. Then if
$k - k_{is} < 0$, the section has not started yet and
$\delta_{isk} = 0$; if $0 \le k - k_{is} < d_i$, the section is
in house at week k and $\delta_{isk} = 1$; if
$k - k_{is} \ge d_i$, the section has finished its instruction
and $\delta_{isk} = 0$. Note that having defined $\delta_{isk}$
in that way implies that
$\delta_{isk} = \delta_{is,k+1} = \cdots = \delta_{is,k+d_i-1} = 1$
when $k_{is} = k$. This ensures that once a section has started
its instruction it will continue week after week until it is
finished.

Summing the $\delta_{isk}$ for all $s \in S_i$ in any week k
produces the number of sections of course i being taught
during week k:

$$\sum_{s=1}^{N_i} \delta_{isk} = \Delta_{ik} \qquad \text{for } k = 1,2,\ldots,T. \tag{2}$$



Then $r_{ij}\,\Delta_{ik}$ represents the amount in hours per
week that facility j is required by all sections of course i
during week k. Summing over all $i \in I$ will produce the
total number of hours of facility j needed to accommodate all
courses in week k, $j \ne 1$.

If a given facility is not to be used more than it is
available in any week, then we have

$$\sum_{i=1}^{L} r_{ij}\,\Delta_{ik} \le M_{jk} \cdot DA_j \qquad \text{for all } j \in J,\ j \ne 1, \text{ and } k \in K,$$

where $M_{jk}$ is the number of units of j which must be
provided in week k.

The number of classrooms required depends on the total
number of hours of meetings other than in laboratories
occurring in a week. Let G be the set of indices j not
corresponding to laboratories and $j \ne 1$.
Then

$$\sum_{j \in G} \sum_{i=1}^{L} r_{ij}\,\Delta_{ik}$$

will represent the total number of classroom-hours needed,
or the total number of meetings other than in laboratories
occurring in week k, and must be less than or equal to
$M_{1k} \cdot DA_1$, the total availability of classroom-hours
which must be provided in that week. Then

$$\sum_{j \in G} \sum_{i=1}^{L} r_{ij}\,\Delta_{ik} \le M_{1k} \cdot DA_1 \qquad \text{for all } k \in K.$$

Henceforth, $M_j = \max(M_{jk} : k = 1,2,\ldots,T)$ represents
the number of units of facility j required in the year in
order that the proposed schedule, defined by $k_{is}$,
$i = 1,2,\ldots,L$, $s = 1,\ldots,N_i$, be a feasible one.
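
The relationships above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the program of Appendix B: the function names, the dictionary layout and the toy data are all hypothetical. Sections that run past week T wrap around to the start of the cycle, and each $M_{jk}$ is taken as the smallest integer satisfying its weekly constraint.

```python
import math

def weekly_sections(starts, duration, T):
    """Delta_ik for one course: sections in session in each week.
    `starts` holds 1-based starting weeks; a section running past
    week T wraps around to week 1 (cyclic year)."""
    delta = [0] * T
    for k_is in starts:
        for offset in range(duration):
            delta[(k_is - 1 + offset) % T] += 1
    return delta

def required_units(schedule, duration, r, DA, G, T=50):
    """Return {j: M_j}, the units of each facility a schedule requires.
    schedule[i] = starting weeks of the sections of course i
    duration[i] = d_i, length of course i in weeks
    r[i][j]     = weekly hours of facility j per section of course i (j != 1)
    DA[j]       = weekly hours available per unit of facility j
    G           = non-laboratory facility indices whose meetings use classrooms
    """
    delta = {i: weekly_sections(schedule[i], duration[i], T) for i in schedule}
    hours = {j: [sum(r[i].get(j, 0.0) * delta[i][k] for i in schedule)
                 for k in range(T)]
             for j in DA if j != 1}
    # Classroom hours (j = 1) are the meeting hours of the facilities in G.
    hours[1] = [sum(hours[j][k] for j in G) for k in range(T)]
    # M_jk is the smallest integer with hours <= M_jk * DA_j; M_j >= 1.
    return {j: max(1, max(math.ceil(h / DA[j]) for h in hours[j]))
            for j in hours}

# Hypothetical toy data: a 10-week "year"; facility 2 is a laboratory,
# facility 3 an instructor group, facility 1 the classrooms.
schedule = {1: [1, 9], 2: [1]}
duration = {1: 3, 2: 2}
r = {1: {2: 20.0, 3: 10.0}, 2: {2: 30.0, 3: 5.0}}
DA = {1: 40.0, 2: 40.0, 3: 40.0}
units = required_units(schedule, duration, r, DA, G=[3], T=10)
```

In the toy data the second section of course 1 starts in week 9 and spills into week 1, where it coincides with two other sections. That overlap is what forces a second laboratory unit.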

B. COST STRUCTURE, AN OBJECTIVE FUNCTION 

Since, in comparing two different schedules with the same
number of sections and courses, the overall operational
costs in a year will be essentially the same for both
schedules, the objective will be to minimize the initial
cost.

Let $\xi$ be the set of starting dates for all sections of
all courses, which constitutes a schedule; thus
$\xi = \{k_{is} : \text{all } i \in I \text{ and all } s \in S_i\}$.
Given a particular schedule $\xi$, the facility cost can be
found from:

$$Z(\xi) = \sum_{j=1}^{M} M_j \cdot C_j, \tag{3}$$

where $C_j$ is the initial cost of facility j for all
$j \in J$. If j is a classroom or laboratory, $C_j$ includes
elements such as equipment, buildings, offices, etc. In the
present study the initial cost to obtain an instructor is
assumed to be zero and therefore does not enter in the
evaluation of a given schedule. A subjective cost could be
attached to the instructors if it were desired to study their
influence in the model. $M_j$ must satisfy the following
relationships:

$$M_j = \max(M_{jk} : k = 1,2,\ldots,T) \qquad \text{for all } j \in J$$

and

$$\sum_{i=1}^{L} r_{ij}\,\Delta_{ik} \le M_{jk} \cdot DA_j \qquad \text{for all } j \in J,\ j \ne 1, \text{ and } k \in K,$$

$$\sum_{j \in G} \sum_{i=1}^{L} r_{ij}\,\Delta_{ik} \le M_{1k} \cdot DA_1 \qquad \text{for all } k \in K,$$

$$M_j \ge 1 \text{ and integer},$$

with $\Delta_{ik}$ defined by equation (2).

The objective is to find that schedule $\xi$ such that
$Z(\xi)$ is a minimum. It is this problem which we treat in
this paper. We do not hope to find the minimum cost $Z^*$
exactly but only to estimate its value and find a schedule
with $Z(\xi) \approx Z^*$. This estimate will be obtained by
using the properties of the extreme values and the Weibull
distribution.
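
As a small illustration of equation (3) and of the preference ordering it induces, the sketch below evaluates the cost of a facility mix and compares two schedules. The function names and the cost figures are hypothetical. Facilities missing from the cost table are charged nothing, mirroring the zero initial cost assumed for instructors.

```python
def facility_cost(units, unit_cost):
    """Z = sum over j of M_j * C_j, as in equation (3).
    units[j]     = M_j, units of facility j the schedule requires
    unit_cost[j] = C_j, initial cost per unit (absent keys cost 0)"""
    return sum(m_j * unit_cost.get(j, 0.0) for j, m_j in units.items())

def preferred(units_a, units_b, unit_cost):
    """Return 'a', 'b' or 'equivalent' under the cost ordering."""
    za = facility_cost(units_a, unit_cost)
    zb = facility_cost(units_b, unit_cost)
    if za == zb:
        return "equivalent"
    return "a" if za < zb else "b"

# Hypothetical costs: classrooms (1) and one laboratory type (2);
# the instructor group (3) has no entry and so costs nothing.
costs = {1: 10.0, 2: 50.0}
z_a = facility_cost({1: 2, 2: 1, 3: 4}, costs)
choice = preferred({1: 2, 2: 1, 3: 4}, {1: 1, 2: 2, 3: 2}, costs)
```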
III. EXTREME VALUES AND THE WEIBULL DISTRIBUTION



A. A USEFUL DISTRIBUTION 

In this section some properties of the extreme values 
and the connection to solving the minimum cost scheduling 
problem will be discussed. Then, using the properties of 
the Weibull distribution, a procedure will be presented to 
estimate the minimum cost. 

A good presentation of the elements of the theory of
extreme values is given by [5]; here we only call attention
to the fact that the lowest value in a random sample of
size n, drawn from a parent population with cumulative
distribution function $F(\cdot)$ (given by
$Z_{\min} = \min(U_1, U_2, \ldots, U_n)$, where the $U_i$ are
sample values), is treated in [8] under the name Smallest
Order Statistic. Weibull in [3] gives the cumulative
distribution of $Z_{\min}$ in its most general form

$$F_{Z_{\min}}(z) = 1 - \exp(-\phi(z)) \tag{4}$$

where the function $\phi(z)$ must have the following
characteristics:

i) Because in the minimal cost-scheduling problem there
exists a true minimum cost such that the minimum value
obtained from the sample cannot be below it, this true
minimum cost represents a lower bound for the extreme-value
distribution. Let $\zeta$ be this true minimum value; then

$$F_{Z_{\min}}(z) = 0 \qquad \text{if } z \le \zeta,\ \zeta > 0,$$

$$F_{Z_{\min}}(z) = 1 - \exp(-\phi(z)) \qquad \text{if } z > \zeta.$$

ii) In addition $\phi(z)$ should behave as
$(z - \zeta)^{\beta}$, where $\beta$ is a parameter that
depends upon the extreme-value spread. $\beta$ must be
greater than zero if $F_{Z_{\min}}$ is a distribution function.

One distribution that satisfies the conditions stated
above is the Weibull distribution (in the form given in
[2]):

$$F_{Z_{\min}}(z) = 1 - \exp\left[-\left(\frac{z - \zeta}{v - \zeta}\right)^{k}\right] \tag{5}$$

where

$F_{Z_{\min}}(z)$ = The probability that $Z_{\min}$ is less
than or equal to a certain value z.

$Z_{\min}$ = The minimum cost criterion,

$\zeta$ = A constant equal to the minimum cost of the
population,

$v$ = A constant parameter indicating the value of
the variable such that the probability that $Z_{\min}$
is equal to or less than v is approximately
0.63 (referred to as the characteristic smallest
value in extreme-value theory),

$k$ = A constant parameter indicating the shape of the
distribution. The distribution will be positively
skewed, symmetrical or negatively skewed,
depending on whether k is less than, equal to,
or greater than 3.259.
There is no strong theoretical support for using this
distribution, but empirical studies have shown the merits
of its applicability [2,3]. The function has the advantage
of requiring no knowledge of the parent distribution and
having as one of its own parameters the boundary value of
interest.
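
For readers who want to experiment with the three-parameter form described above, the following sketch evaluates the distribution function and its inverse. The function names and the parameter values are assumptions made for illustration. The final lines check the stated property that the probability of a value at or below the characteristic smallest value v is about 0.63.

```python
import math

def weibull_min_cdf(z, zeta, v, k):
    """Three-parameter Weibull distribution function for the minimum cost.
    zeta = lower bound, v = characteristic smallest value, k = shape."""
    if z <= zeta:
        return 0.0
    return 1.0 - math.exp(-(((z - zeta) / (v - zeta)) ** k))

def weibull_min_quantile(p, zeta, v, k):
    """Inverse of weibull_min_cdf for 0 <= p < 1."""
    return zeta + (v - zeta) * (-math.log(1.0 - p)) ** (1.0 / k)

# Hypothetical parameters: bound 100, characteristic value 150, shape 2.
p_at_v = weibull_min_cdf(150.0, 100.0, 150.0, 2.0)       # 1 - 1/e
median = weibull_min_quantile(0.5, 100.0, 150.0, 2.0)
```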

As has been suggested before, the minimum cost obtained
after applying a suitable heuristic mechanism to a random
value drawn from the parent population can be regarded as
an extreme value obtained from this same population. If
the same procedure is repeated n times, a sample of n random
and independent observations of the extreme value, i.e. of
$Z_{\min}$, has been obtained. It is then possible to
estimate the value of the lowest cost facility mix-schedule
by fitting the Weibull distribution to the data.

The author is well aware that the minimum cost samples
are discrete and that trying to fit them to a continuous
distribution will be a source of error. However, for a
moderately large sample, the approximation will be close
enough so as to accept the estimated lowest cost as a lower
bound for any extreme value obtained. As a general rough
guide only, $n \ge 30$ is a reasonable range of values of n
for applying approximations in most asymptotic results [8].

B. PARAMETER ESTIMATION OF THE WEIBULL DISTRIBUTION 

Let $Z_{\min_1}, Z_{\min_2}, \ldots, Z_{\min_n}$ be n
independent and identically distributed random variables. Let
$Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(k)} \le \cdots \le Z_{(n)}$
be the observations ordered according to their increasing
size from the sample of minimum costs with cumulative
distribution function as (5).

Recall that the distribution of $Z_{\min}$ should be discrete;
it is then possible that several different combinations
of starting dates, i.e. schedules, will have the same cost
value. Since observations can be present with differing
frequencies, define

$$(Z_{(1)}, f^{(1)}),\ (Z_{(2)}, f^{(2)}),\ \ldots,\ (Z_{(m)}, f^{(m)})$$

where $Z_{(k)}$ is defined as before, $f^{(k)}$ is the
corresponding frequency with which the $k$-th value occurs,
and m is the number of different values that $Z_{\min}$
takes. Then, according to [7], an unbiased estimator of the
cumulative distribution function of the Weibull distribution
is given by:

$$P^{(k)} = F_{Z_{\min}}(Z_{\min} \le z_{(k)}) = \sum_{i=1}^{k} \frac{f^{(i)}}{n+1}$$

and

$$\sum_{i=1}^{m} f^{(i)} = n.$$

Having the data in the proper form and the estimate of
the distribution function, the effort is now turned to the
task of estimating the parameters of the Weibull
distribution. Recall that this particular version of the
distribution has three unknown parameters. Although there are
more accurate methods of estimating these parameters [2], we
will present a method based on the double logarithm
transformation of the distribution function (5) as indicated
in [2,3,7]:

$$\log\left[-\log\left(1 - P^{(k)}\right)\right] = k \log(Z_{(k)} - \zeta) - k \log(v - \zeta) \tag{6}$$

This is the equation of a straight line on Weibull
probability paper. The slope of the line is the shape
parameter k. Although the parameter v is not self-evident
from the graph, it can be easily estimated, as will be
shown later. This procedure of plotting can be repeated
for different estimates of $\zeta$, starting with
$\zeta = Z_{(1)}$ and lowering its value until the best
straight line fit to the data is obtained by the least
squares method.

The derivation of the regression equations for estimating
the Weibull parameters will be done for grouped
data. Let $Z_{(k)}$ and $f^{(k)}$ be defined as before for
$k = 1,2,\ldots,m$ and let the associated cumulative
distribution be:

$$P^{(k)} = \sum_{i=1}^{k} \frac{f^{(i)}}{n+1} \tag{7}$$

For simplicity define

$$Y_k = \log\left[-\log\left(1 - P^{(k)}\right)\right], \qquad k = 1,2,\ldots,m \tag{8}$$
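$$\bar{Y} = \frac{1}{m} \sum_{k=1}^{m} Y_k$$

$$X_k = \log(Z_{(k)} - \zeta), \qquad k = 1,2,\ldots,m$$

$$\bar{X} = \frac{1}{m} \sum_{k=1}^{m} X_k .$$

The regression equation as given in [9] is:

$$\hat{Y}_k = \bar{Y} + b(X_k - \bar{X}) \tag{9}$$

where $\hat{Y}_k$ is the estimate of $Y_k$ given $X_k$ and b.
The slope of the line is given by

$$b = \frac{\displaystyle\sum_{k=1}^{m} X_k Y_k - \frac{1}{m}\left(\sum_{k=1}^{m} X_k\right)\left(\sum_{k=1}^{m} Y_k\right)}{\displaystyle\sum_{k=1}^{m} X_k^2 - \frac{1}{m}\left(\sum_{k=1}^{m} X_k\right)^2}.$$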

Writing the regression line in slightly different form,

$$\hat{Y}_k = bX_k + (\bar{Y} - b\bar{X}),$$

and comparing it with (6), we have b as the value of
$\hat{k}$, the estimator of the shape parameter. In addition,
since

$$-\hat{k} \log(v - \zeta) = \bar{Y} - b\bar{X},$$

$\hat{v}$, the estimator of the scale parameter, is obtained
from

$$\hat{v} = \zeta + \exp\left(\bar{X} - \frac{\bar{Y}}{\hat{k}}\right).$$

Once the values of $\hat{k}$ and $\hat{v}$ have been obtained
for a given $\zeta$, determine the sum of squares of the
differences between the values obtained by (8) using (7) and
the values estimated by the regression equation (9), i.e.

$$SQR_{\zeta} = \sum_{k=1}^{m} (Y_k - \hat{Y}_k)^2 .$$

Repeat this step for different values of $\zeta$ such that
$\zeta'' < \zeta'$, until some value of $\zeta$, say
$\zeta^*$, yields the minimum sum of squared differences about
the regression line. The associated estimators of k and v,
say $k^*$ and $v^*$, are the estimated parameters of the
Weibull distribution and $\zeta^*$ is the estimate of the
lowest cost, i.e. the estimated value of the true minimum
cost.
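
The estimation procedure of this section can be sketched as follows. This is a hedged illustration rather than the program of Appendix B: the function names, the grid of trial values for the lower bound and the synthetic data are assumptions. The slope is computed in centered form, which is algebraically the same as the formula for b above, and the trial bound starts one step below the smallest observation so that every logarithm is defined.

```python
import math
from collections import Counter

def fit_for_zeta(costs, zeta):
    """Least-squares fit of the double-log line for a trial bound
    zeta < min(costs). Needs at least two distinct cost values.
    Returns (k_hat, v_hat, sqr)."""
    n = len(costs)
    groups = sorted(Counter(costs).items())      # (Z_(k), f^(k)) pairs
    xs, ys, cum = [], [], 0
    for z, f in groups:
        cum += f
        p = cum / (n + 1)                        # plotting position (7)
        xs.append(math.log(z - zeta))
        ys.append(math.log(-math.log(1.0 - p)))  # transformation (8)
    m = len(xs)
    xbar, ybar = sum(xs) / m, sum(ys) / m
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx                                # slope = shape estimate
    v = zeta + math.exp(xbar - ybar / b)         # scale estimate
    sqr = sum((y - (ybar + b * (x - xbar))) ** 2 for x, y in zip(xs, ys))
    return b, v, sqr

def estimate_weibull(costs, step, n_steps):
    """Lower zeta in equal steps and keep the best straight-line fit.
    Returns (zeta*, k*, v*, sqr)."""
    best = None
    for t in range(1, n_steps + 1):
        zeta = min(costs) - t * step
        k, v, sqr = fit_for_zeta(costs, zeta)
        if best is None or sqr < best[3]:
            best = (zeta, k, v, sqr)
    return best

# Synthetic check: costs placed exactly on a Weibull with
# zeta = 100, v = 150, k = 2 at the plotting positions i / (n + 1).
data = [100.0 + 50.0 * (-math.log(1.0 - i / 21.0)) ** 0.5 for i in range(1, 21)]
k_hat, v_hat, sqr = fit_for_zeta(data, 100.0)
best = estimate_weibull(data, step=1.0, n_steps=30)
```

On real samples the fit will not be exact, and the step size trades running time against the resolution of the estimated bound.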
IV. A HEURISTIC ALGORITHM



Having developed the model and examined the theory of 
extreme values, the remaining steps are to present a 
heuristic cost reducing procedure and some kind of stopping 
decision rule. 

A. COST REDUCING PROCEDURE 

As stated before, once a random schedule has been
obtained, a heuristic procedure designed to find a schedule
whose cost is a local minimum can be applied to it. The
following heuristic procedure is suggested for moving from
a random schedule to another of lower cost, say $\xi'$:

1. For each type of facility there is a maximum number
of units required by a given schedule. However, this
maximum number of units is not required for every week of
the year. Arrange the facilities in order of increasing
number of weeks during which the maximum number of units is
required. The order in which they will be selected in the
searching process is given by the above ordering. In the
case of a tie the higher priority should be assigned to
that facility whose initial cost is higher.

2. Beginning with the facility that has highest
priority, determine which courses are related to it and
order them by decreasing size of the amount of facility
required.
3. The search continues by successively picking
the course which requires the largest amount of the facility,
say i, and, with the exception of the $s$-th section, fixing
the starting dates of all sections of all courses. Change the
starting date for section s, i.e., let $k_{is} = k$ for all
$k = 1,2,\ldots,T$, and compute the cost of the modified
schedule. If for some value of k this yields a cost which is
less than the previous one, retain this as a possible
starting date for section s.

4. Repeat steps 1, 2, and 3 until no further
improvement is obtained.

The searching procedure can possibly be shortened,
since it is very unlikely that a facility which is fully
utilized fifty per cent or more of the time can be diminished
by one unit. Consequently those facilities with usage
factor greater than fifty per cent should not be taken into
account in the search procedure, and neither should those
whose maximum number of units is one.
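
The core of step 3, trying every starting week for one section while all others stay fixed, can be sketched as below. This is a deliberately simplified version: it visits the sections in a fixed order and omits the facility and course priority ordering of steps 1 and 2. The names `local_search` and `peak_load` and the toy cost function are assumptions made for illustration.

```python
def local_search(schedule, cost, T):
    """Move one section at a time to the starting week that lowers the
    cost most; repeat full passes until a pass yields no improvement.
    schedule: dict mapping (course, section) -> start week in 1..T
    cost:     callable returning the cost Z of such a dict"""
    current = dict(schedule)
    best_cost = cost(current)
    improved = True
    while improved:
        improved = False
        for key in list(current):
            best_week = current[key]
            for week in range(1, T + 1):
                current[key] = week
                c = cost(current)
                if c < best_cost:            # strict decrease only
                    best_cost, best_week = c, week
                    improved = True
            current[key] = best_week         # keep the best week found
    return current, best_cost

def peak_load(schedule, duration=2, T=6):
    """Toy cost for illustration only: the largest number of sections
    in session in any week of a cyclic T-week year."""
    load = [0] * T
    for start in schedule.values():
        for offset in range(duration):
            load[(start - 1 + offset) % T] += 1
    return max(load)

# Three two-week sections all starting in week 1 of a six-week year.
start = {("a", 1): 1, ("a", 2): 1, ("a", 3): 1}
final, z = local_search(start, peak_load, T=6)
```

Because only strict decreases are accepted, the cost falls with every move and the loop must terminate, although the schedule reached is only a local minimum.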

B. A STOPPING RULE 

By using the above procedure we obtain a set of
schedules whose costs $Z_{\min}(\xi_i')$ can be viewed as a
random sample of extreme values. These are, then, used to
estimate $\zeta = Z_{\min}(\xi^*)$, the minimum cost value that
could be achieved. When translated into a computer program it
is necessary to incorporate a rule such that after a number of
trials a decision can be made whether to stop or to continue
making more observations.
For instance, the following rule could be used: if the
expected decrease in cost by improving the scheduling is
greater than or equal to the cost of one unit of the least
expensive laboratory, make additional trials; otherwise
stop. Formally, if

$$\int_{\zeta}^{ZMM} \left(ZMM - Z_{\min}(\xi)\right) dF_{Z_{\min}} \ge CI(\cdot),$$

continue searching. The number of additional trials q is
given by

$$q = \frac{1}{C} \int_{\zeta}^{ZMM} \left(ZMM - Z_{\min}(\xi)\right) dF_{Z_{\min}};$$

if

$$\int_{\zeta}^{ZMM} \left(ZMM - Z_{\min}(\xi)\right) dF_{Z_{\min}} < CI(\cdot),$$

stop; where

ZMM is the lowest value obtained at the moment of
decision,

$\zeta$ is the estimated value of the minimum cost,

$dF_{Z_{\min}}$ is the probability density function of the
Weibull distribution,

$CI(\cdot)$ is the cost of one unit of the least expensive
laboratory, and

C is the cost of one additional trial, including
computer cost, analyst cost, etc.
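
A numerical sketch of this rule follows. It rests on several assumptions that are not part of the original procedure: the function names, the midpoint-rule integration, and rounding q down to a whole number of trials. Integration by parts, using $F_{Z_{\min}}(\zeta) = 0$, rewrites the expected decrease as $\int_{\zeta}^{ZMM} F_{Z_{\min}}(z)\,dz$, which avoids evaluating the density near the lower bound.

```python
import math

def expected_improvement(zmm, zeta, v, k, steps=10_000):
    """Midpoint-rule estimate of the integral of (ZMM - z) dF(z) over
    [zeta, ZMM], computed as the integral of F(z) over the same range
    for the fitted three-parameter Weibull."""
    if zmm <= zeta:
        return 0.0
    h = (zmm - zeta) / steps
    total = 0.0
    for s in range(steps):
        z = zeta + (s + 0.5) * h
        total += 1.0 - math.exp(-(((z - zeta) / (v - zeta)) ** k))
    return total * h

def decide(zmm, zeta, v, k, lab_cost, trial_cost):
    """Return ('continue', q) or ('stop', 0) under the rule in the text.
    lab_cost is CI(.), trial_cost is C; q is rounded down (assumption)."""
    gain = expected_improvement(zmm, zeta, v, k)
    if gain >= lab_cost:
        return "continue", math.floor(gain / trial_cost)
    return "stop", 0

# Hypothetical unit-free check with zeta = 0, v = 1, k = 1, ZMM = 1:
# the integral of 1 - exp(-z) over [0, 1] equals exp(-1).
gain = expected_improvement(1.0, 0.0, 1.0, 1.0)
```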

It may still be in the designer's interest to examine
additional decision parameters. Two of those are extensively
explained by McRoberts [2]. Here we will give only some
illustration of the reasons why they are of interest.
These two parameters are the cumulative distribution function
$F_{Z_{\min}}$ and the value $ZMM - \zeta^*$. The value of the
shape parameter k tends to describe the skewness of the
distribution, as illustrated in Figure 1. Assuming that areas
A and B are equal, the probability of improvement
$F_{Z_{\min}}$ may represent a very small absolute potential
for a small k or a large band of improvement for a bigger k;
it may then be desirable to identify regions of
$ZMM - \zeta^*$ and $F_{Z_{\min}}$ in the rejection range such
that questionable zones may be evaluated more closely. To do
so McRoberts [2] presents a plot of $ZMM - \zeta^*$ against
the probability of improvement.

Figure 2 from [2] presents such a plot, which is based
on considering an equicost curve as a function of
$F_{Z_{\min}}$, $CI(\cdot)$ and C, plotted in the
$F_{Z_{\min}}$ - $(ZMM - \zeta^*)$ plane. By
considering the equicost curve in isolation, the threshold
decision line may be represented as intersecting the C
curve at the value that may also be considered a threshold
value of the absolute range of improvement, below
which further search will not be feasible [2].

Zone A. A cost value point falling in here leads to
a decision for continued searching. This
region is analogous to the statistical Type
II error.

Zone B. Above threshold value, continue searching.

Zone C. This zone lies below the threshold level to
the left of the feasible absolute range.
Therefore stop search.

Zone D. Here the probability of improvement may be
low, but the absolute range of improvement
is large. This implies continuing the search.

Figure 1. Effect of parameter k [2].

Figure 2. Decision zones for continued search [2].

Other decision parameters could be the Facility
Utilization Factor (UFX) and the Usage Time factor. Both have
been added to the computer program. Actually they are two
measures of performance. The facility utilization factor
$UFX(\cdot)$ is defined as the ratio between the actual
requirements in hours of facility $(\cdot)$ by all courses in
the period in question and the total availability in hours of
that facility in the same period:

$$UFX(j) = \frac{\sum_{i=1}^{L} r_{ij}\, d_i\, N_i}{T \cdot M_j \cdot DA_j}, \qquad 0 \le UFX(j) \le 1,$$

for all $j \in J$.

The usage time factor PMAX is defined as the number of
weeks that the facility is used at its maximum rate divided
by T = 50 weeks.
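
Both measures are easy to compute once the weekly requirements of a facility are known. The sketch below follows the verbal definitions given above; the function names and the four-week toy figures are assumptions made for illustration.

```python
def usage_time_factor(weekly_units, T=50):
    """PMAX: share of the T weeks in which the facility is needed at its
    peak number of units. weekly_units[k] is M_jk for week k."""
    peak = max(weekly_units)
    return sum(1 for m in weekly_units if m == peak) / T

def utilization_factor(weekly_hours, units, hours_per_unit):
    """UFX: hours required over the whole period divided by the hours
    available from `units` units over the same number of weeks."""
    available = len(weekly_hours) * units * hours_per_unit
    return sum(weekly_hours) / available

# Hypothetical four-week period for one laboratory type.
pmax = usage_time_factor([2, 1, 1, 2], T=4)
ufx = utilization_factor([70.0, 50.0, 20.0, 20.0], units=2, hours_per_unit=40.0)
```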

Four cases are immediately self evident: 

a. UFX small, PMAX small: poor scheduling. It may be
   possible to eliminate one unit of that facility, and
   consequently its cost, by searching for a better
   schedule.

b. UFX large, PMAX small: it is very unlikely that this
   case can occur. If UFX is not so large and PMAX is very
   small, it turns out to be the previous case.

c. UFX small, PMAX large: this implies that it may be
   possible to make better use of that facility by
   incrementing the number of sections of the course or
   courses that make use of it. Under the present
   conditions no improvements are obtained by further
   search.

d. UFX large, PMAX large: implies good employment of
   that resource. No further improvement may be possible.
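
The two performance measures and the four cases above can be
illustrated with a small sketch. It is written in Python rather
than the FORTRAN IV of the thesis program, and it is not the
author's code. The function names, the sample figures and the 0.5
cut-off separating "small" from "large" are all assumptions made
for illustration; the text does not state a numerical threshold.

```python
T = 50  # working weeks in the planning period, as in the text


def utilization_factor(required_hours, available_hours):
    """UFX: hours of a facility required by all courses / hours available."""
    return required_hours / available_hours


def usage_time_factor(weekly_units, T=T):
    """PMAX: number of weeks the facility is used at its maximum rate, over T."""
    peak = max(weekly_units)
    return sum(1 for units in weekly_units if units == peak) / T


def classify(ufx, pmax, threshold=0.5):
    """Map (UFX, PMAX) to cases a-d. The threshold is an assumed cut-off."""
    ufx_large = ufx >= threshold
    pmax_large = pmax >= threshold
    if not ufx_large and not pmax_large:
        return "a"  # poor scheduling, a unit might be eliminated
    if ufx_large and not pmax_large:
        return "b"  # very unlikely combination
    if not ufx_large and pmax_large:
        return "c"  # more sections could use the facility
    return "d"      # good employment of the resource


# Hypothetical facility: 2 units needed in 10 weeks, 1 unit in the other 40.
weekly_units = [2] * 10 + [1] * 40
pmax = usage_time_factor(weekly_units)
ufx = utilization_factor(required_hours=1200.0, available_hours=4000.0)
case = classify(ufx, pmax)
print(ufx, pmax, case)
```

With these made-up numbers the peak of two units is needed in only
ten of the fifty weeks, so the facility falls in case a and a
better schedule might save one unit.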


V. SOME NUMERICAL EXAMPLES AND DISCUSSION



The search procedure presented in the previous section 
turns out to be fairly time consuming on the computer, at 
least for research purposes. Hence it is necessary to make 
further investigations in that area as well as to refine 
the stopping rule if it is possible. 

In order to demonstrate the method in Sections III and
IV, some hypothetical problems were run by varying the
number of sections of each course, without using the search 
procedure. The random sample was obtained by picking the 
least cost schedule out of a sample of ten and repeating 
this step n = 100 times. 
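
The sampling step just described (keep the cheapest of ten random
schedules, repeat 100 times) can be sketched as follows. This is
an illustration in Python, not the thesis program. The cost
function `toy_cost`, the two-course data and the truncation of
sections at week T are hypothetical stand-ins for the facility
cost model of the earlier sections.

```python
import random

T = 50  # working weeks in the planning period


def random_schedule(sections, rng):
    """Draw a starting week in 1..T for every section of every course."""
    return {course: [rng.randint(1, T) for _ in range(n)]
            for course, n in sections.items()}


def toy_cost(schedule, duration):
    """Hypothetical cost: peak number of concurrent sections, summed over courses."""
    total = 0
    for course, starts in schedule.items():
        load = [0] * (T + 1)  # index 0 is unused; weeks run 1..T
        for k in starts:
            # Assumption: a section running past week T is truncated.
            for week in range(k, min(k + duration[course], T + 1)):
                load[week] += 1
        total += max(load)
    return total


def sample_of_minima(sections, duration, n=100, m=10, seed=0):
    """Repeat n times: generate m random schedules and keep the least cost."""
    rng = random.Random(seed)
    return [
        min(toy_cost(random_schedule(sections, rng), duration) for _ in range(m))
        for _ in range(n)
    ]


sections = {1: 4, 2: 3}     # sections per course (illustrative)
duration = {1: 14, 2: 10}   # course lengths in weeks (hypothetical)
minima = sample_of_minima(sections, duration)
print(len(minima), min(minima))
```

The list `minima` plays the role of the sample of minimum costs to
which an extreme-value distribution is fitted. Changing `seed`
corresponds to varying the seed of the random schedule generator.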

The most economical schedule found was compared against 
a proposed schedule similar to that suggested by Willingham 
[1]. A description of that schedule is given in Appendix A. 

Table 2, below, shows the results of one of those examples.
It was first assumed that the availability of instructors
is infinite, i.e., their initial cost was zero. That
problem was compared with the results of the same schedule
assuming a subjective initial cost of assigning an instructor
to the school, equal for all types of instructors. In
either case the proposed schedule was as good as or better
than the most economical of the sample; with the instructor
cost included, its cost was even less than the estimated
minimum cost inferred from the sample values.

Other runs were made for the same problem, this time
varying the seeds of the random schedule generator,
with the same results as described above.

Table 3 shows the results obtained by varying the 
number of sections of each course. 

It is interesting to point out that no schedule was 
found with a cost less than the one proposed in Appendix 
A for each case. This suggests the possibility of starting
with that schedule and then applying a search procedure 
seeking a minimum value. Of course this might not have 
been the case if the heuristic procedure had been used. In 
addition the sample size is extremely small compared to the 
parent population of all possible schedules as shown in (1). 

This thesis and the paper by Willingham both assumed
that every section of every course used facilities evenly 
over its duration. The method in this thesis does not 
require that assumption to be made and could be easily 
modified to reflect more accurately the actual facility 
usages over time for each section. If this were done it 
is unlikely that a "balanced schedule" of the type described 
in the Appendix could so easily be found. Thus the schedule 
proposed in the Appendix would appear less favorable 
compared to the method described here. 


TABLE 1



Common data for a Hypothetical School 



course type    duration    Requirements of facility
                           1 2 3 4 5 6 7 8 9 10 11 12 13 15 16 17 18 19 20 21 22 23



1 

li| 



2 

10 



3 

23 



12 15 



5 

H 

10 



6 

3 

2 



23 18 



8 

8 



20 



18 



6 

18 



1 

2 

2 

7 



5 

12 

2 



7 

21 



5 

7 



7 

H 

8 

3 



8 

21 



2 

7 



7 
3 

8 

3 



Relative 
Initial 
cost of 
Facilities 

.5 

25.0 

175.0 

150.0 
8 i ^.0 
19.0 

55.0 

75.0 
11 75.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

1.0 

■ 10 1.0 
l ^^ 1.0 



Where G = {j: 2, 3, 4, 5, 6, 7, 8, 9}


TABLE 2



COURSE TYPE            1   2   3   4   5   6   7   8   9
NUMBER OF SECTIONS     4   3  16  10   6  20   8   8   5

Problem 1

Including a subjective instructor cost (**):

PROPOSED SCHEDULE (in Appendix A):              Cost 1414.00
MOST ECONOMIC (from sample):                         1422.00
STATISTICAL INFERENCE: Estimated minimum cost        1421.98
                       Weibull shape parameter        .60574

Problem 2

Not including instructor cost:

PROPOSED SCHEDULE (by Appendix A):              Cost 1383.00
MOST ECONOMIC (from sample):                         1383.00
STATISTICAL INFERENCE: Estimated minimum cost        1379.86
                       Weibull shape parameter        .66359

(*) The difference between these two solutions was in
the number of instructors required. The number of
units of each type of laboratory was the same.

Since the problem is to minimize the resources
required, the proper way to analyse the problem
is to include a cost of assigning an instructor
to the school.


TABLE 3



Problem 3

COURSE TYPE            1   2   3   4   5   6   7   8   9
NUMBER OF SECTIONS     8   4   4  18   6   6  10   5   9

PROPOSED SCHEDULE (by Appendix A):              Cost 1039.00
MOST ECONOMIC (from sample):                         1042.00
STATISTICAL INFERENCE: Estimated minimum cost        1041.93
                       Weibull shape parameter        .70306


Figure 3. Probability density function obtained from the sample used in
Problem 3, for which the result is shown in Table 3.



VI. CONCLUSION AND SUMMARY 



This study addressed the problem of obtaining a
schedule for a training school that minimizes the cost of
the set of facilities required to support a given training
plan, based on the number of students in each type of course
taught. The problem was approached by building a mathematical
model of the interaction between scheduling and facilities,
which was later translated into FORTRAN IV.

Making use of the theory of extreme-values, a 
sample of random minimum costs was obtained and this data 
was fitted to a Weibull distribution to estimate the 
minimum cost that it is possible to achieve. 

Further investigation is required to develop an
efficient search procedure, that is, a fast and reliable
algorithm that reaches a minimum cost schedule when applied
successively to a random schedule drawn from the population
of all feasible schedules.

The model without the heuristic procedure was
tested in several runs that varied the number of
sections required in a one-year period. In each case a
proposed schedule (see Appendix A) was as good as or better
than the most economical schedule in a sample of random
schedules treated with extreme-value theory, although this
result would probably not hold if the courses had not been
assumed to use the facilities at a uniform rate over their
duration.

From the present results it appears that a good
method would be to use a schedule of the type proposed in
[1] and then perhaps to apply some improving procedure to
that schedule.


APPENDIX A



PROPOSING A SCHEDULE 



A similar schedule has been proposed in [1] with no 
claim of being optimal but with the intuition that the 
facilities are used more or less evenly throughout the 
period under study. 

Assume the planning period and the number of sections
of each course to be taught in that period (usually a year)
are known, i.e., $T = 50$ weeks and $N_i$ for all $i \in I$.

Define the interval between convenings for the sections
of the $i$-th course by $CS_i = [T/N_i]$, the largest integer
not exceeding $T/N_i$. Since the right-hand side is seldom an
exact integer, define

$$DIF = T - N_i \, CS_i - 1.$$

Assign to the sections numbered 1 the starting date $k_{i1} = 1$
for all $i \in I$. Then the remaining $(N_i - 1)$ sections will
be scheduled to start their training $CS_i$ weeks apart, except
for DIF sections that will be scheduled $CS_i + 1$ weeks apart;
it does not matter which of the $(N_i - 1)$ sections are selected.
That will give a balanced schedule for each type of course
in the sense of facility use.
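
The construction above can be sketched for a single course as
follows. This is an illustrative Python version, not part of the
thesis program. The function name is hypothetical. Two choices are
assumptions: DIF is clamped at zero, because the formula gives -1
whenever N divides T exactly, and the longer gaps are placed first,
which the text leaves arbitrary.

```python
def balanced_starts(T, N):
    """Starting weeks for the N sections of one course over T weeks."""
    CS = T // N                   # interval between convenings, [T/N]
    DIF = max(0, T - N * CS - 1)  # number of gaps lengthened by one week
    starts = [1]                  # section 1 starts in week 1
    for s in range(1, N):
        # Assumption: the first DIF gaps are the longer ones.
        gap = CS + 1 if s <= DIF else CS
        starts.append(starts[-1] + gap)
    return starts


print(balanced_starts(50, 4))   # [1, 14, 26, 38]
print(balanced_starts(50, 10))  # [1, 6, 11, 16, 21, 26, 31, 36, 41, 46]
```

For T = 50 and N = 4 the interval is CS = 12 and DIF = 1, so one
gap is 13 weeks and the other two are 12 weeks.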



APPENDIX B 



THE COMPUTER PROGRAM 

The program has been written in FORTRAN IV for the IBM
360/67 system at the Naval Postgraduate School. However,
the program is self-contained and can be used on any other
machine that supports the FORTRAN IV programming language.

The program has been organized into subroutines to ease
further improvements or changes. In general it is composed
of:

(a) SUBROUTINE RMDATE which generates the set of random 
starting dates as indicated below. 

Because of the assumption that any section of any
course can start its instruction at any week of the
year, let the random variable $K_{is}$ be the starting
date of the $s$-th section of course $i$. Then the
probability that $K_{is}$ is equal to $k_{is} = k$ is given
by

$$P(K_{is} = k_{is}) = \frac{1}{T} \quad \text{for all } k \in K.$$

Then the cumulative distribution function is

$$P(K_{is} \le k_{is}) = \begin{cases} k/T & \text{if } 1 \le k \le T, \\ 0 & \text{otherwise.} \end{cases}$$

As we can see, the distribution of $K_{is}$ is the
discrete version of the uniform distribution. Then
to generate a random schedule it is only necessary to
use the following modification to any uniform random
number generator on [0,1]:

$$K_{is} = [T \cdot RN + 0.5], \text{ largest integer},$$

where

$K_{is}$ is the generated (random) starting date of
the $s$-th section of course $i$,

$T$ is the length in working weeks of the period
that has been considered; $T = 50$ weeks per
year,

$RN$ is the uniform random number in the interval
[0, 1].

(b) SUBROUTINE COMPUT which computes, according to the 
model developed in Section I, the following values: 

1. Number of units of each type of facility 
required to meet the specified program of 
training. 

2. Number of weeks in which the above are actually 
required. 

3. The total usage, in hours per year, of each 
facility by all courses. 

4. The least cost schedule (ZMM) of the sample. 

(c) The search procedure is incorporated into the main
program by means of several subroutines:

1. SUBROUTINE SELECJ which arranges the facilities
according to the procedure described in
Section IV.

2. SUBROUTINE SELECI which selects a course.

3. SUBROUTINE REVAL which updates the
values obtained by SUBROUTINE COMPUT after
one starting date has been changed.

(d) SUBROUTINE WEIBULL which computes the estimates 
indicated in Section III; namely:

1. ^ location parameter or minimum cost estimate. 

2. k shape parameter. 

3. V scale parameter. 

(e) SUBROUTINE RULE which decides whether to stop or to 
continue searching. This part of the program makes 
use of the IBM SUBROUTINE QG9 (integration of a given 
function by the Gaussian quadrature method, nine 
points formula) which has been added under the name 
SUBROUTINE RULE. This subroutine also determines 
how many additional trials should be done, if any. 
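
The starting-date generator described under SUBROUTINE RMDATE can
be sketched as follows. This is an illustrative Python version,
not the FORTRAN IV routine, and both function names are
hypothetical. The first function applies the formula as printed.
Since that formula maps RN below 0.5/T to week 0, the second
function uses floor(T * RN) + 1 instead, which is an assumed
variant that keeps every date in 1..T with equal probability.

```python
import math
import random

T = 50  # working weeks per year, as in the text


def start_week_as_printed(rn, T=T):
    """Starting date from the printed formula [T * RN + 0.5]."""
    return math.floor(T * rn + 0.5)


def start_week(rn, T=T):
    """Assumed variant: floor(T * RN) + 1, capped at T in case RN equals 1."""
    return min(T, math.floor(T * rn) + 1)


rng = random.Random(1)  # any uniform generator on [0, 1) would do
dates = [start_week(rng.random()) for _ in range(1000)]
print(min(dates), max(dates))
print(start_week_as_printed(0.0), start_week(0.0))  # 0 1
```

The last line shows the difference at the lower end of the
interval: the printed formula returns week 0 for RN = 0, while the
variant returns week 1.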

One of the features of the program is that one or more 
predetermined schedules can be tested in case there exist 
constraints in the scheduling period. 

The program also gives a write-up of each tested schedule
and of the most economical schedule, including characteristic
values such as utilization factors, usage percentage and weekly
requirements of facilities in units of facilities.



All data is fed to the program by means of DATA statements
at its beginning, except the course requirements
and the proposed schedules, which are fed by means of data
cards. The arrangement of the data cards is as follows:

1. The first card has the values of two program control
parameters, ST and SNO. ST indicates the number of
proposed schedules that the programmer or decision-maker
wishes to test. SNO indicates whether the search
procedure is desired or not.

2. The next set of cards are course requirements, one 
for each type of facility. 

3. Finally come the proposed schedule cards. Each card
contains the starting dates of the sections of one course.



COMPUTER PROGRAM LISTING 

READ IN PROGRAM CONTROL INDICATORS
ALL STARTING POINTS FOR A GIVEN SET OF SECTIONS ARE NOW DETERMINED
START COMPUTATIONS
COMPUTE AGAIN COST AND CHARACTERISTICS OF MOST ECONOMIC SCHEDULE